Database Record Duplicate Detection System using Simil Algorithm



Similar resources

Chapter 2 Duplicate Record Detection Using ANFIS

The duplicate detection problem is to determine whether the same real-world object is represented by two or more distinct entries in a database. Duplicate detection is also known as record linkage or record matching. It is a heavily researched topic of vital importance in fields such as master data management, data warehousing, and ETL (Extraction, Transformation and Loading), cu...


Chapter 3 Duplicate Record Detection Using GA and PSO

The present chapter extends the research discussed in Chapter 2 by applying optimization algorithms. Moises G. de Carvalho et al. (2011) proposed a genetic programming approach to record deduplication. This approach automatically suggests a duplicate record detection function by combining several pieces of evidence taken from the data. This function makes it possible to identify whether t...


Near Duplicate Web Page Detection using NDupDet Algorithm

The Web is a system of interlinked hypertext documents accessed via the Internet. The Internet is a global system of interconnected computer networks that serves billions of users worldwide. The huge number of documents on the Web is a challenge for web search engines. The Web contains multiple copies of the same content or the same web page. Many of these pages on the Web are duplicates and near duplicates of othe...


Effective and Efficient XML Duplicate Detection Using Levenshtein Distance Algorithm

There is a large body of work on discovering duplicates in relational data, but only a few approaches address duplication in more complex hierarchical structures. Electronic information is a key factor in many business operations, applications, and decisions; as a result, guaranteeing its quality is necessary. Duplicates are several representations of ...


PSO Algorithm to Select Subsets of Q-Gram Features for Record Duplicate Detection

Although data quality issues grow with the ever-increasing quantity of data, data engineering has improved significantly of late. Consequently, private and government organizations have invested significantly in developing methods for removing replicas from data repositories. This phenomenon has generated significant interest among researcher...



Journal

Journal title: International Journal on Computer Science and Engineering

Year: 2018

ISSN: 2229-5631,0975-3397

DOI: 10.21817/ijcse/2018/v10i2/181002013